Linked Data for Life Sciences

Authors

  • Amrapali Zaveri
  • Gökhan Ertaylan
Abstract

Massive amounts of data are available, and more are being produced at an unprecedented rate, in all domains of the life sciences worldwide. However, these data are stored in disparate locations and in different, often unstructured formats, which makes them very hard to integrate. In this review, we examine the state of the art and propose the use of the Linked Data (LD) paradigm, a set of best practices for publishing and connecting structured data on the Web in a semantically meaningful format. We argue that utilizing LD in the life sciences will make data sets more Findable, Accessible, Interoperable, and Reusable. We identify three tiers of the research cycle in the life sciences in which the proposed LD paradigm can primarily be utilized: (i) systematic review of the existing body of knowledge, (ii) meta-analysis of data, and (iii) knowledge discovery of novel links across different evidence streams. Finally, we demonstrate the use of LD in three use case scenarios built around the same research question and discuss the future of data/knowledge integration in the life sciences and the challenges ahead.

Similar resources

Linked Environment Data for the Life Sciences

Environment Agencies from Europe and the US are setting up a network of Linked Environment Data and are looking to crosslink it with Linked Data contributions from the life sciences.

A Provenance Assisted Roadmap for Life Sciences Linked Open Data Cloud

A significant portion of Web of Data is composed of multiple datasets that add high value to biomedical research. These datasets have been exposed on the web as a part of the Life Sciences Linked Open Data (LSLOD) Cloud. Different initiatives have been proposed for navigating through these datasets with or without vocabulary reuse. The significance of provenance information regarding life scien...

Bio2RDF Release 3: A larger, more connected network of Linked Data for the Life Sciences

Bio2RDF is an open source project to generate and provide Linked Data for the Life Sciences. Here, we report on a third coordinated release of ~11 billion triples across 30 biomedical databases and datasets, representing a 10 fold increase in the number of triples since Bio2RDF Release 2 (Jan 2013). New clinically relevant datasets have been added. New features in this release include improved ...

Bio2RDF Release 2: Improved Coverage, Interoperability and Provenance of Life Science Linked Data

Bio2RDF currently provides the largest network of Linked Data for the Life Sciences. Here, we describe a significant update to increase the overall quality of RDFized datasets generated from open scripts powered by an API to generate registry-validated IRIs, dataset provenance and metrics, SPARQL endpoints, downloadable RDF and database files. We demonstrate federated SPARQL queries within and ...

Improving Discovery in Life Sciences Linked Open Data Cloud

Multiple datasets that add high value to biomedical research have been exposed on the web as part of the Life Sciences Linked Open Data (LSLOD) Cloud. The ability to easily navigate through these datasets is crucial for personalized medicine and the improvement of drug discovery process. However, navigating these multiple datasets is not trivial as most of these are only available as isolated S...

Journal:
  • Algorithms

Volume 10

Publication date: 2017